Unit 6: Text Data Processing 


Data Services prepares data for query, analytics, and reporting. Benefits of the native 
integration of text analytics include: 


• Enhanced Business Insight: Text data processing (TDP) extends Enterprise Information 
Management and Business Intelligence handling of unstructured data. TDP improves the 
quality of analysis and reporting, and contributes to competitive advantage. 


• Improved Governance: TDP can support improved transparency and oversight through 
monitoring. 


• Increased Productivity: TDP reduces the number of errors caused by manual data entry, 
improving efficiency and reducing costs. 


Content is pulled from sources such as notes fields, file systems, spreadsheets, or other 
repositories. The current release supports text documents in HTML, TXT, or XML format. TDP 
and Data Services then parse the text to extract meaning from it and transform it into 
structured data that can be integrated into a database. Once the data is in the database, 
Business Intelligence (BI) tools can support query, analysis, and reporting on that text data. 
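
The flow described above (pull content, parse the text, store structured rows) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Data Services implements it: the regular expression is a toy stand-in for real entity extraction, and the table name `entity`, the function names, and the `html`/`txt` document kinds are all hypothetical.

```python
import re
import sqlite3
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collects the text nodes of an HTML document, discarding tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def to_plain_text(content, kind):
    """Return whitespace-normalized text for an 'html' or 'txt' document."""
    if kind == "html":
        parser = _TextOnly()
        parser.feed(content)
        parser.close()
        return " ".join(" ".join(parser.parts).split())
    return " ".join(content.split())


# Toy rule (an assumption for illustration): one or more capitalized words
# followed by a company suffix. Real extraction is far more sophisticated.
COMPANY_RE = re.compile(r"\b((?:[A-Z][a-z]+ )+(?:Inc|Corp|GmbH))\b")


def extract_rows(doc_id, text):
    """Turn free text into (doc_id, entity_type, value) rows."""
    return [(doc_id, "COMPANY", m.group(1)) for m in COMPANY_RE.finditer(text)]


def load(documents):
    """Parse each (doc_id, kind, content) document into an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE entity (doc_id TEXT, entity_type TEXT, value TEXT)")
    for doc_id, kind, content in documents:
        rows = extract_rows(doc_id, to_plain_text(content, kind))
        conn.executemany("INSERT INTO entity VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn
```

Once the rows are in the table, an ordinary SQL query such as `SELECT value, COUNT(*) FROM entity GROUP BY value` plays the role that a BI tool would play against the real target database.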


Text Data Processing Extractions and Languages 
• The Entity Extraction transform extracts the following data: 

  - Predefined entities such as company, person, name, street, city, country, and so on 

  - Sentiment analysis such as strong positive, weak positive, neutral, weak negative, 
    strong negative 

  - Custom entities, customized using dictionaries 

• Supported languages include the following: 

  - English 

  - French 

  - German 

  - Japanese 

  - Simplified Chinese 

  - Over 25 other languages 
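
To make the idea of dictionary-based custom entities more concrete, the following minimal Python sketch maps variant spellings to a standard form and an entity type. It does not use the Data Services dictionary format or API. The `PRODUCT` type, the product names, and the function names are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical custom dictionary: entity type -> standard form -> variants.
CUSTOM_DICTIONARY = {
    "PRODUCT": {
        "Widget Pro": ["widget pro", "wpro"],
        "Gadget Mini": ["gadget mini", "g-mini"],
    },
}


def build_lookup(dictionary):
    """Flatten the dictionary into variant -> (entity_type, standard_form)."""
    lookup = {}
    for entity_type, entries in dictionary.items():
        for standard_form, variants in entries.items():
            for variant in variants:
                lookup[variant.lower()] = (entity_type, standard_form)
    return lookup


def extract_custom_entities(text, dictionary=CUSTOM_DICTIONARY):
    """Find dictionary variants in text and report their standard forms."""
    lookup = build_lookup(dictionary)
    if not lookup:
        return []
    # Longest variants first, so a longer phrase wins over a shorter one.
    alternatives = sorted(lookup, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(v) for v in alternatives) + r")(?!\w)",
        re.IGNORECASE,
    )
    results = []
    for match in pattern.finditer(text):
        entity_type, standard_form = lookup[match.group(1).lower()]
        results.append({
            "type": entity_type,
            "standard_form": standard_form,
            "matched_text": match.group(1),
            "offset": match.start(1),
        })
    return results
```

Each result keeps both the text as it appeared and the standard form, so that differently spelled mentions of the same item can be grouped together in later reporting.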